The violent crime rate in the U.S. increased by 3.4 percent nationwide in 2016. As international students and New Yorkers, we are always concerned about public safety in NYC, especially after the recent terrorist attack near the World Trade Center. Our group therefore decided to investigate the crime data more deeply and look for underlying factors behind the increase in the crime rate.
The official NYPD website provides citywide historic crime data as Excel files. We downloaded these datasets and merged them into nyc_crime_hist. The resulting data frame contains the total number of offenses from 2000 to 2016, the major offense categories (felony, misdemeanor, and violation), and detailed descriptions.
We focus our efforts on the data for the current year, 2017, which we obtained from NYC OpenData. It includes all valid felony, misdemeanor, and violation crimes reported to the NYPD through October of this year. The dataset was last updated on October 25, 2017.
Since the dataset has 341716 observations of 9 variables, we randomly sample 50000 observations to create an interactive map showing where the crimes in New York City occurred.
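The sampling step can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the code used for the report; the function name `sample_records`, the field names, and the toy rows are all hypothetical.

```python
import random

def sample_records(records, k, seed=None):
    """Draw k records uniformly at random, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the sample reproducible
    if k >= len(records):
        # Fewer rows than requested: keep everything instead of raising.
        return list(records)
    return rng.sample(records, k)

# Hypothetical toy stand-in for the 2017 complaint table.
toy_records = [
    {"id": i, "latitude": 40.5 + i * 1e-4, "longitude": -74.0 + i * 1e-4}
    for i in range(1000)
]
map_points = sample_records(toy_records, k=50, seed=2017)
```

Sampling without replacement keeps each complaint at most once, so the map does not draw duplicate markers.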
In a future Shiny app, we intend to add widgets that limit the data size, so that the map shows only crimes within a specific date range or a specific borough.
The historic data shows the trend of crimes per month from 2000 to 2016. The number of crimes per month is calculated by dividing the total number of crimes in a year by 12 months (by 9 for 2017, since the 2017 data is not yet complete). The results indicate that misdemeanors are far more prevalent than violations and felonies.
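The per-month calculation described above can be written as a short function. This is a sketch with made-up totals, not the report's figures; `crimes_per_month` and its parameters are hypothetical names.

```python
def crimes_per_month(yearly_totals, months_observed=None, default_months=12):
    """Divide each yearly total by the number of months it covers.

    months_observed maps a year to its month count when the year is
    incomplete; every other year is assumed to cover default_months.
    """
    months_observed = months_observed or {}
    return {
        year: total / months_observed.get(year, default_months)
        for year, total in yearly_totals.items()
    }

# Hypothetical totals, chosen only to make the arithmetic easy to check.
toy_totals = {2016: 1200, 2017: 900}
monthly = crimes_per_month(toy_totals, months_observed={2017: 9})
```

With these toy numbers both years average 100 crimes per month, which shows why the partial year must be divided by its own month count rather than by 12.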
We can see that, overall, the number of crimes per month has been decreasing since 2000. However, misdemeanor crimes increased from 2005 to 2010 and dropped again after 2010.
Next, we plotted the distribution of crime counts in 2017 by month, grouped by borough. Overall, the crime counts stay at a stable level throughout the year, except for slight fluctuations in misdemeanors.
Here, we look more closely at the crime counts and crime rates by month this year. To calculate the crime rate, we need population data for NYC, which we obtained from the website. The results show that Brooklyn has the highest number of crimes this year, but the Bronx has the highest crime rate. Queens is relatively safer. We also find that February has fewer crimes, probably because February is usually the coldest month and people tend to spend more time indoors. Less time spent outdoors helps account for fewer crimes in the cold season.
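The distinction between crime counts and crime rates can be made concrete with a small sketch. We assume here that the rate is expressed per 100000 residents; the function name and the two areas with their numbers are hypothetical and are not the borough data.

```python
def crime_rate_per_100k(crime_count, population):
    """Return the number of crimes per 100000 residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return crime_count * 100_000 / population

# Hypothetical areas: A has more crimes, B has the higher rate.
toy_counts = {"A": 500, "B": 300}
toy_population = {"A": 10_000, "B": 5_000}
toy_rates = {
    area: crime_rate_per_100k(toy_counts[area], toy_population[area])
    for area in toy_counts
}
```

Area A has the larger count, yet area B has the higher rate, which is the same pattern seen for Brooklyn and the Bronx.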
After analyzing the trends by year and month, we also added a plot of crime count versus hour of day. It clearly shows that crime usually happens during the daytime. Crimes peak between 15:00 and 20:00. Between 03:00 and 08:00, crime numbers are relatively low.
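Counting crimes by hour amounts to extracting the hour from each timestamp and tallying. The sketch below assumes times are stored as "HH:MM:SS" strings, which is an assumption about the data format; the function name and the sample times are hypothetical.

```python
from collections import Counter

def crimes_by_hour(times):
    """Return a 24-element list of counts, indexed by hour of day.

    Each entry of times is assumed to be an 'HH:MM:SS' string.
    """
    counts = Counter(int(t.split(":")[0]) for t in times)
    return [counts.get(hour, 0) for hour in range(24)]

# Hypothetical timestamps for illustration only.
toy_times = ["15:30:00", "15:45:00", "03:10:00", "19:00:00"]
hourly = crimes_by_hour(toy_times)
peak_hour = max(range(24), key=lambda hour: hourly[hour])
```

Returning a fixed 24-element list keeps hours with zero crimes in the plot instead of silently dropping them.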
In this part, we build a function that finds the most prevalent crimes in each borough and illustrate the result.
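One way such a function could look is sketched below. It is not the function used in the report; the name `top_offenses_by_borough`, the (borough, offense) record layout, and the toy rows are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def top_offenses_by_borough(records, n=3):
    """Return the n most frequent offenses for each borough.

    records is an iterable of (borough, offense) pairs.
    """
    counts = defaultdict(Counter)
    for borough, offense in records:
        counts[borough][offense] += 1
    return {borough: c.most_common(n) for borough, c in counts.items()}

# Hypothetical rows, not taken from the dataset.
toy_rows = [
    ("BRONX", "PETIT LARCENY"),
    ("BRONX", "PETIT LARCENY"),
    ("BRONX", "ROBBERY"),
    ("QUEENS", "BURGLARY"),
]
top = top_offenses_by_borough(toy_rows, n=1)
```

The result maps each borough to a list of (offense, count) pairs, which can be passed directly to a bar chart.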
Comments
For example, the Bronx has a median family income of 35176 dollars and a crime rate of 5336.878. That is, we expect approximately 5337 crime cases per 100000 people in the Bronx. Similarly, Manhattan's crime rate is 5184.963, which indicates approximately 5185 crime cases per 100000 people in Manhattan. In contrast, boroughs with a median family income between 60000 and 70000 dollars tend to have the lowest crime rates. Taking Queens as an example, the crime rate is 2958.740, so we expect only about 2959 crime cases per 100000 people.
We then looked more closely at the reasons behind these results. Why is there such a large difference in crime rate between boroughs?
For the Bronx, we think the main reason is lower income. Bronx families have the lowest median income among all five boroughs, while the cost of living in New York City is high. People need money for food, housing, and other living expenses. Faced with this problem, a proportion of lower-income people may turn to illegal ways of making money. This may explain why the Bronx has the highest crime rate.
For Manhattan, the situation is somewhat different, since the median family income reaches 75575 dollars annually. The main reason for its very high crime rate therefore cannot be low income. Based on our studies, we think Manhattan's multicultural and multiracial population may be a factor in its high crime rate. People from different cultures and backgrounds live in Manhattan, and they sometimes find it hard to get along with each other. As a result, conflicts may arise.
The Bronx has the highest unemployment rate, 6.6%, along with the highest crime rate, 5336.878.
In contrast, Queens has the lowest unemployment rate, 4.3%, along with the lowest crime rate, 2958.740.
However, this positive correlation between unemployment rate and crime rate fails to explain the case of Manhattan, which has an unemployment rate of 4.4% but a crime rate of 5184.963.
We plot the top 10 words in the offense descriptions:
The graph shows the 10 most frequent words in the offense descriptions. The most frequent one is larceny, which appears nearly 100000 times. Other frequent words include related, petit, assault, harrassment, etc. Most of them indicate the type of crime, which is consistent with what we expect.
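A word-frequency count of this kind can be sketched as follows. The tokenization rule (lowercase alphabetic runs only) is our assumption and may differ from the one used for the plot; the function name and the sample descriptions are hypothetical.

```python
import re
from collections import Counter

def top_words(descriptions, n=10):
    """Return the n most frequent words across all descriptions."""
    words = []
    for text in descriptions:
        # Keep lowercase alphabetic tokens; digits and symbols are dropped.
        words.extend(re.findall(r"[a-z]+", text.lower()))
    return Counter(words).most_common(n)

# Hypothetical descriptions, spelled as they might appear in the data.
toy_descriptions = [
    "PETIT LARCENY",
    "GRAND LARCENY",
    "ASSAULT 3 & RELATED OFFENSES",
    "HARRASSMENT 2",
]
most_frequent = top_words(toy_descriptions, n=1)
```

Because the words are taken verbatim from the descriptions, spellings such as harrassment are reported exactly as they occur in the data.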
We compare distinctive words in the offense descriptions of violations and felonies.
The above chart compares distinctive words (that is, words that appear much more frequently in one group than the other) in the offense descriptions of violations and felonies. We can see that larceny, robbery, burglary, etc. appear more frequently in felony descriptions, while harrassment, gambling, and loitering appear more frequently in violation descriptions. These results give a basic picture of the difference between felonies and violations.
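One common way to score how distinctive a word is for one group over another is a smoothed log ratio of relative frequencies. The report does not state which measure produced the chart, so the sketch below is an assumption; the add-one smoothing, the function names, and the toy descriptions are all hypothetical choices.

```python
import math
import re
from collections import Counter

def word_counts(descriptions):
    """Count lowercase alphabetic tokens across all descriptions."""
    counts = Counter()
    for text in descriptions:
        counts.update(re.findall(r"[a-z]+", text.lower()))
    return counts

def log_ratios(group_a, group_b):
    """Return log(freq in A / freq in B) per word, with add-one smoothing.

    Positive values mark words more typical of group_a, negative values
    words more typical of group_b.
    """
    a, b = word_counts(group_a), word_counts(group_b)
    vocab = set(a) | set(b)
    total_a = sum(a.values()) + len(vocab)
    total_b = sum(b.values()) + len(vocab)
    return {
        w: math.log(((a[w] + 1) / total_a) / ((b[w] + 1) / total_b))
        for w in vocab
    }

# Hypothetical descriptions for the two offense types.
toy_felony = ["GRAND LARCENY", "ROBBERY", "BURGLARY"]
toy_violation = ["HARRASSMENT 2", "LOITERING"]
ratios = log_ratios(toy_felony, toy_violation)
```

The smoothing keeps the ratio finite for words that never occur in one of the groups, which is the usual case for the most distinctive words.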
Our analysis focuses on providing information about crimes in NYC. The map helps visualize where crimes occurred in NYC. The trend plots show how crime changes over years and months. The analysis also demonstrates that crimes are more likely to happen in the afternoon and evening than around midnight and in the early morning. Looking at the different boroughs, we analyzed the most common crimes in each. The Bronx has relatively more crime, while Queens is relatively the safest borough in NYC. Among all crime types, misdemeanor is the most frequent.
This information about crimes prompted us to ask what causes the differences in crime frequency. We collected data on income and unemployment and analyzed their association with crime rate. Since NYC has only 5 boroughs, there is no clear and strong evidence of a significant relationship.